
    Making sense of real-world scenes

    To interact with the world, we have to make sense of the continuous sensory input conveying information about our environment. A recent surge of studies has investigated the processes enabling scene understanding, using increasingly complex stimuli and sophisticated analyses to highlight the visual features and brain regions involved. However, there are two major challenges to producing a comprehensive framework for scene understanding. First, scene perception is highly dynamic, subserving multiple behavioral goals. Second, a multitude of different visual properties co-occur across scenes and may be correlated or independent. We synthesize the recent literature and argue that for a complete view of scene understanding, it is necessary to account for both differing observer goals and the contribution of diverse scene properties.

    Direct comparison of contralateral bias and face/scene selectivity in human occipitotemporal cortex

    Human visual cortex is organised broadly according to two major principles: retinotopy (the spatial mapping of the retina in cortex) and category-selectivity (preferential responses to specific categories of stimuli). Historically, these principles were considered anatomically separate, with retinotopy restricted to the occipital cortex and category-selectivity emerging in the lateral-occipital and ventral-temporal cortex. However, recent studies show that category-selective regions exhibit systematic retinotopic biases, for example exhibiting stronger activation for stimuli presented in the contra- compared to the ipsilateral visual field. It is unclear, however, whether responses within category-selective regions are more strongly driven by retinotopic location or by category preference, and if there are systematic differences between category-selective regions in the relative strengths of these preferences. Here, we directly compare contralateral and category preferences by measuring fMRI responses to scene and face stimuli presented in the left or right visual field and computing two bias indices: a contralateral bias (response to the contralateral minus ipsilateral visual field) and a face/scene bias (preferred response to scenes compared to faces, or vice versa). We compare these biases within and between scene- and face-selective regions and across the lateral and ventral surfaces of the visual cortex more broadly. We find an interaction between surface and bias: lateral surface regions show a stronger contralateral than face/scene bias, whilst ventral surface regions show the opposite. These effects are robust across and within subjects, and appear to reflect large-scale, smoothly varying gradients. Together, these findings support distinct functional roles for the lateral and ventral visual cortex in terms of the relative importance of the spatial location of stimuli during visual information processing. 
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s00429-021-02411-8
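The two bias indices described in the abstract can be sketched in a few lines. The abstract specifies only the subtractions (contralateral minus ipsilateral; preferred category minus the other), so the sign convention for the face/scene bias shown here is an assumption.

```python
def contralateral_bias(resp_contra: float, resp_ipsi: float) -> float:
    """Contralateral bias: response to the contralateral minus the
    ipsilateral visual field (positive = contralateral preference)."""
    return resp_contra - resp_ipsi

def face_scene_bias(resp_scene: float, resp_face: float) -> float:
    """Face/scene bias: response to scenes minus response to faces
    (positive = scene preference; sign convention is an assumption)."""
    return resp_scene - resp_face

# Illustrative values: a lateral-surface voxel whose spatial preference
# outweighs its category preference, consistent with the reported interaction.
print(contralateral_bias(2.0, 0.5))   # 1.5 -> strong contralateral bias
print(face_scene_bias(0.5, 0.75))     # -0.25 -> weak face preference
```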

    Low-level contrast statistics are diagnostic of invariance of natural textures

    Texture may provide important clues for real-world object and scene perception. To be reliable, these clues should ideally be invariant to common viewing variations such as changes in illumination and orientation. In a large image database of natural materials, we found textures with low-level contrast statistics that varied substantially under viewing variations, as well as textures that remained relatively constant. This led us to ask whether textures with constant contrast statistics give rise to more invariant representations compared to other textures. To test this, we selected natural texture images with either high (HV) or low (LV) variance in contrast statistics and presented these to human observers. In two distinct behavioral categorization paradigms, participants more often judged HV textures as “different” compared to LV textures, showing that textures with constant contrast statistics are perceived as being more invariant. In a separate electroencephalogram (EEG) experiment, evoked responses to single texture images (single-image ERPs) were collected. The results show that differences in contrast statistics correlated with both early and late differences in occipital ERP amplitude between individual images. Importantly, ERP differences between images of HV textures were mainly driven by illumination angle, which was not the case for LV images: there, differences were completely driven by texture membership. These converging neural and behavioral results imply that some natural textures are surprisingly invariant to illumination changes and that low-level contrast statistics are diagnostic of the extent of this invariance.
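The HV/LV selection criterion can be illustrated with a minimal sketch. The abstract does not specify which contrast statistics were used, so RMS contrast here is a stand-in, and the toy pixel values are invented for illustration.

```python
import statistics

def rms_contrast(luminances):
    """RMS contrast of a patch: std of luminance divided by its mean
    (a stand-in for the study's low-level contrast statistics)."""
    mu = statistics.fmean(luminances)
    return statistics.pstdev(luminances) / mu

def contrast_variance_across_views(views):
    """Variance of the contrast statistic across renderings of the same
    texture under different illuminations/orientations. High values
    would mark an HV texture, low values an LV texture."""
    return statistics.pvariance([rms_contrast(v) for v in views])

# Toy example: one texture stable across three views, one unstable
stable   = [[10, 12, 9, 11], [10, 11, 10, 12], [9, 11, 10, 12]]
unstable = [[10, 30, 5, 25], [12, 13, 11, 12], [5, 40, 2, 35]]
assert contrast_variance_across_views(unstable) > \
       contrast_variance_across_views(stable)
```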

    Evaluating the correspondence between face-, scene-, and object-selectivity and retinotopic organization within lateral occipitotemporal cortex

    The organization of human lateral occipitotemporal cortex (lOTC) has been characterized largely according to two distinct principles: retinotopy and category-selectivity. Whereas category-selective regions were originally thought to exist beyond retinotopic maps, recent evidence highlights overlap. Here, we combined detailed mapping of retinotopy, using population receptive fields (pRF), and category-selectivity to examine and contrast the retinotopic profiles of scene- (occipital place area, OPA), face- (occipital face area, OFA) and object- (lateral occipital cortex, LO) selective regions of lOTC. We observe striking differences in the relationship each region has to the underlying retinotopy. Whereas OPA overlapped multiple retinotopic maps (including V3A, V3B, LO1, and LO2), and LO overlapped two maps (LO1 and LO2), OFA overlapped almost none. There appears to be no simple, consistent relationship between category-selectivity and retinotopic maps, meaning category-selective regions are not consistently constrained by retinotopic map borders. The multiple maps that overlap OPA suggest it may not be appropriate to conceptualize it as a single scene-selective region, whereas the absence of any systematic map overlapping OFA suggests it may constitute a more uniform area. Beyond their relationship to retinotopy, all three regions contained strongly retinotopic voxels, with pRFs exhibiting a significant bias towards the contralateral lower visual field despite differences in pRF size, contributing to an emerging literature suggesting this bias is present across much of lOTC. Taken together, these results suggest that whereas category-selective regions do not consistently contain ordered retinotopic maps, they nonetheless likely inherit retinotopic characteristics of the maps from which they draw information.

    Spatially pooled contrast responses predict neural and perceptual similarity of naturalistic image categories.

    The visual world is complex and continuously changing. Yet, our brain transforms patterns of light falling on our retina into a coherent percept within a few hundred milliseconds. Possibly, low-level neural responses already carry substantial information to facilitate rapid characterization of the visual input. Here, we computationally estimated low-level contrast responses to computer-generated naturalistic images, and tested whether spatial pooling of these responses could predict image similarity at the neural and behavioral level. Using EEG, we show that statistics derived from pooled responses explain a large amount of variance between single-image evoked potentials (ERPs) in individual subjects. Dissimilarity analysis on multi-electrode ERPs demonstrated that large differences between images in pooled response statistics are predictive of more dissimilar patterns of evoked activity, whereas images with little difference in statistics give rise to highly similar evoked activity patterns. In a separate behavioral experiment, images with large differences in statistics were judged as different categories, whereas images with little difference were confused. These findings suggest that statistics derived from low-level contrast responses can be extracted in early visual processing and can be relevant for rapid judgment of visual similarity. We compared our results with two other, well-known contrast statistics: Fourier power spectra and higher-order properties of contrast distributions (skewness and kurtosis). Interestingly, whereas these statistics allow for accurate image categorization, they do not predict ERP response patterns or behavioral categorization confusions. These converging computational, neural and behavioral results suggest that statistics of pooled contrast responses contain information that corresponds with perceived visual similarity in a rapid, low-level categorization task.
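The comparison statistics mentioned at the end of the abstract, skewness and kurtosis of a contrast distribution, are standard higher-order moments and can be sketched directly; this is a generic implementation, not the study's code.

```python
import statistics

def skewness(xs):
    """Third standardized moment of a contrast distribution
    (0 for a symmetric distribution)."""
    mu, sd = statistics.fmean(xs), statistics.pstdev(xs)
    return sum((x - mu) ** 3 for x in xs) / (len(xs) * sd ** 3)

def kurtosis(xs):
    """Fourth standardized moment (3.0 for a Gaussian)."""
    mu, sd = statistics.fmean(xs), statistics.pstdev(xs)
    return sum((x - mu) ** 4 for x in xs) / (len(xs) * sd ** 4)

print(skewness([1, 2, 3, 4, 5]))  # 0.0 (symmetric values)
```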

    Your conflict matters to me! Behavioral and neural manifestations of control adjustment after self-experienced and observed decision-conflict

    In everyday life we tune our behavior to a rapidly changing environment as well as to the behavior of others. The behavioral and neural underpinnings of such adaptive mechanisms are the focus of the present study. In a social version of a prototypical interference task, we investigated whether trial-to-trial adjustments are comparable when we experience conflicting action tendencies ourselves and when we simulate such conflicts while observing another player performing the task. Using behavioral measures and event-related brain potentials, we showed that both our own and observed conflict result in comparable trial-to-trial adjustments. These adjustments are found in the efficiency of behavioral performance and in the amplitude of an event-related potential in the N2 time window. In sum, in both behavioral and neural terms, we adapt to conflicts happening to others just as if they happened to us.

    Scene complexity modulates degree of feedback activity during object detection in natural scenes.

    Selective brain responses to objects arise within a few hundred milliseconds of neural processing, suggesting that visual object recognition is mediated by rapid feed-forward activations. Yet disruption of neural responses in early visual cortex beyond feed-forward processing stages affects object recognition performance. Here, we unite these discrepant findings by reporting that object recognition involves enhanced feedback activity (recurrent processing within early visual cortex) when target objects are embedded in natural scenes that are characterized by high complexity. Human participants performed an animal target detection task on natural scenes with low, medium or high complexity as determined by a computational model of low-level contrast statistics. Three converging lines of evidence indicate that feedback was selectively enhanced for high complexity scenes. First, functional magnetic resonance imaging (fMRI) activity in early visual cortex (V1) was enhanced for target objects in scenes with high, but not low or medium complexity. Second, event-related potentials (ERPs) evoked by target objects were selectively enhanced at feedback stages of visual processing (from ~220 ms onwards) for high complexity scenes only. Third, behavioral performance for high complexity scenes deteriorated when participants were pressed for time and thus less able to incorporate the feedback activity. Modeling of the reaction time distributions using drift diffusion revealed that object information accumulated more slowly for high complexity scenes, with evidence accumulation being coupled to trial-to-trial variation in the EEG feedback response. Together, these results suggest that while feed-forward activity may suffice to recognize isolated objects, the brain employs recurrent processing more adaptively in naturalistic settings, using minimal feedback for simple scenes and increasing feedback for complex scenes.
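The drift-diffusion modeling step can be illustrated with a minimal simulation: evidence accumulates noisily toward a decision bound, and a lower drift rate (as reported for high-complexity scenes) produces longer reaction times. All parameter values here are illustrative assumptions, not the fitted values from the study.

```python
import random

def simulate_ddm(drift, threshold=1.0, noise=0.1, dt=0.001, max_t=3.0, rng=None):
    """One drift-diffusion trial: evidence accumulates at rate `drift`
    plus Gaussian noise until it reaches +threshold (correct response)
    or -threshold (error). Returns (reaction time, correct?)."""
    rng = rng or random
    x, t = 0.0, 0.0
    while abs(x) < threshold and t < max_t:
        x += drift * dt + rng.gauss(0.0, noise) * dt ** 0.5
        t += dt
    return t, x >= threshold

rng = random.Random(0)
# Slower evidence accumulation (as for high-complexity scenes) should
# lengthen mean reaction time.
fast = [simulate_ddm(2.0, rng=rng)[0] for _ in range(50)]
slow = [simulate_ddm(0.5, rng=rng)[0] for _ in range(50)]
assert sum(slow) / len(slow) > sum(fast) / len(fast)
```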

    Visual dictionaries as intermediate features in the human brain

    The human visual system is assumed to transform low-level visual features into object and scene representations via features of intermediate complexity. How the brain computationally represents intermediate features is still unclear. To further elucidate this, we compared the biologically plausible HMAX model and the Bag of Words (BoW) model from computer vision. Both computational models use visual dictionaries, candidate features of intermediate complexity, to represent visual scenes, and both have proven effective in automatic object and scene recognition. The models differ, however, in how they compute visual dictionaries and in their pooling techniques. We investigated where in the brain, and to what extent, human fMRI responses to a short video can be accounted for by multiple hierarchical levels of the HMAX and BoW models. Brain activity of 20 subjects obtained while viewing a short video clip was analyzed voxel-wise using a distance-based variation partitioning method. Results revealed that both HMAX and BoW explain a significant amount of brain activity in early visual regions V1, V2 and V3. However, BoW exhibits more consistency across subjects in accounting for brain activity than HMAX. Furthermore, visual dictionary representations by HMAX and BoW explain a significant amount of brain activity in higher areas believed to process intermediate features. Overall, our results indicate that although both HMAX and BoW account for activity in the human visual system, BoW seems to represent neural responses in low- and intermediate-level visual areas more faithfully.
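The core BoW operation, assigning local descriptors to a visual dictionary and pooling them into a histogram, can be sketched as follows. The 2-D toy descriptors and three-word dictionary are invented for illustration; the actual models in the study use learned dictionaries over richer local features.

```python
def nearest_codeword(descriptor, dictionary):
    """Index of the dictionary codeword closest (squared Euclidean
    distance) to a local descriptor."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(dictionary)), key=lambda i: dist2(descriptor, dictionary[i]))

def bow_representation(descriptors, dictionary):
    """Pool local descriptors into a normalized visual-word histogram,
    the intermediate-level scene representation used by a BoW model."""
    hist = [0.0] * len(dictionary)
    for d in descriptors:
        hist[nearest_codeword(d, dictionary)] += 1.0
    n = len(descriptors)
    return [h / n for h in hist]

# Toy dictionary of three 2-D codewords and four local descriptors
dictionary = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
descs = [(0.1, 0.1), (0.9, 0.1), (0.1, 0.9), (0.0, 1.1)]
print(bow_representation(descs, dictionary))  # [0.25, 0.25, 0.5]
```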

    Contrast histograms of natural images follow a Weibull distribution.

    (A) Three natural images with varying degrees of detail and scene fragmentation. The homogeneous, texture-like image of grass (upper row) contains many edges of various strengths; its contrast distribution approaches a Gaussian. The strongly segmented image of green leaves against a uniform background (bottom row) contains very few, strong edges that are highly coherent; its distribution approaches a power law. Most natural images, however, have distributions in between (middle row). The degree to which images vary between these two extremes is reflected in the free parameters of a Weibull fit to the contrast histogram: β (beta) and γ (gamma). (B) For each of 200 natural scenes, the beta and gamma values were derived by fitting the Weibull distribution to their contrast histograms. Beta describes the width of the histogram: it varies with the distribution of local contrast strengths. Gamma describes the shape of the histogram: it varies with the amount of scene clutter. Four representative pictures are shown in each corner of the parameter space. Images with a high degree of scene segmentation, e.g. a leaf on top of snow, are found in the lower left corner, whereas highly cluttered images are on the right. Images with more depth are located at the top, whereas flat images are found at the bottom. Images are from the McGill Calibrated Colour Image Database [86].
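The beta/gamma fit described in the caption can be sketched with a standard maximum-likelihood procedure for the two-parameter Weibull distribution F(x) = 1 - exp(-(x/beta)**gamma). The fixed-point iteration for the shape parameter is a textbook method, not necessarily the fitting procedure used in the paper; the synthetic contrast values are generated by inverse-transform sampling for a self-check.

```python
import math
import random

def fit_weibull(xs, iters=500):
    """Maximum-likelihood fit of F(x) = 1 - exp(-(x / beta)**gamma)
    to positive contrast values, via a damped fixed-point iteration
    for the shape. Returns (beta, gamma): beta tracks histogram width,
    gamma its shape (following the caption's naming)."""
    n = len(xs)
    logs = [math.log(x) for x in xs]
    mean_log = sum(logs) / n
    gamma = 1.0  # initial shape guess
    for _ in range(iters):
        xg = [x ** gamma for x in xs]
        new = 1.0 / (sum(p * l for p, l in zip(xg, logs)) / sum(xg) - mean_log)
        if abs(new - gamma) < 1e-9:
            break
        gamma = 0.5 * (gamma + new)  # damping for robust convergence
    beta = (sum(x ** gamma for x in xs) / n) ** (1.0 / gamma)
    return beta, gamma

# Self-check: recover known parameters from synthetic contrast values
rng = random.Random(1)
beta_true, gamma_true = 1.5, 2.0
xs = [beta_true * (-math.log(1.0 - rng.random())) ** (1.0 / gamma_true)
      for _ in range(5000)]
beta_hat, gamma_hat = fit_weibull(xs)
```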